GPU: Migrate buffers on GPU project, pre-emptively flush device local mappings #6794

riperiperi · 2024-05-10T22:30:04Z

This essentially retreads #4540, but the work now lives in the GPU project instead of the backend. That gives us much more control over, and knowledge of, where a buffer's backing has changed, and lets us pre-emptively flush pages to host memory for quicker readback. It will also enable other things in the future, but we'll get there when we get there.

Similar to #4540, this should only affect performance on dedicated GPUs, not integrated or mobile ones. A lot of the work here could help separate host-imported buffers from copied ones in the future, but that's not really important right now. The main target is NVIDIA; I don't know how this will affect AMD (hopefully not negatively).

Performance greatly improved in Hyrule Warriors: Age of Calamity (v1).
Performance notably improved in TOTK (average).
Performance for BOTW restored to how it was before #4911 (Vulkan: Device map buffers written more than flushed), perhaps a bit better.
Performance for newer Pokemon games also improved. This might have regressed with #4911, but nobody noticed because they are bad.

Extra

- Rewrites a bunch of buffer migration stuff. Might want to tighten up how dispose stuff works.
- Fixed an issue where the copy for texture pre-flush would happen _after_ the syncpoint.
  - May fix an issue where character lighting was sometimes broken on Splatoon 3.

TODO stuff

- Do a lot of testing (I've only tested on an NVIDIA desktop).
- Less permanent effect for storage on fragment. Right now it forces the buffer to be device local, but that may change in the future. I'd prefer it to just have a very potent effect for keeping the buffer device local or device local mapped.
- ~~Remove the "Auto" mode from Vulkan, as it isn't used anymore.~~

… mappings

Essentially retreading Ryujinx#4540, but it's on the GPU project now instead of the backend. This allows us to have a lot more control + knowledge of where the buffer backing has been changed and allows us to pre-emptively flush pages to host memory for quicker readback. It will allow us to do other stuff in the future, but we'll get there when we get there.

Performance greatly improved in Hyrule Warriors: Age of Calamity.
Performance notably improved in TOTK (average).
Performance for BOTW restored to how it was before Ryujinx#4911, perhaps a bit better.

- Rewrites a bunch of buffer migration stuff. Might want to tighten up how dispose stuff works.
- Fixed an issue where the copy for texture pre-flush would happen _after_ the syncpoint.

TODO: remove a page from pre-flush if it isn't flushed after a certain number of copies.

gdkchan · 2024-05-11T20:18:02Z

Just for the record, this is my test on Zelda Tears of the Kingdom (on Ryzen 9 7900X and RTX 3060):
master: (image not preserved)

PR: (image not preserved)

(Around +30 fps on title screen, +10 ingame).

gdkchan · 2024-05-11T22:11:53Z

Splatoon 3 shadow issue seems to be fixed.
master: (image not preserved)

PR: (image not preserved)

GamerzHell9137 · 2024-05-12T00:06:15Z

Tests done on the Ryzen 3600.

Monster Hunter Rise - Master 27 FPS / PR 30 FPS

Note: Cleans up the remaining FPS drops in more stressed areas of the game.

Catherine - Master 72 FPS / PR 92 FPS

Note: FPS is higher in general, but more importantly the FPS lows rise from under 30 to over 30, which makes the game playable speed-wise.

Littl3Guy · 2024-05-13T17:29:17Z

Slight performance improvement in Pokemon Scarlet. TOTK is still performing badly for different reasons (RDNA2), so no improvement there.
Master: (image not preserved)

PR: (image not preserved)

I didn't notice any regressions in other games.

lostromb · 2024-05-16T21:05:10Z

Testing on an AMD 5500XT - Windows - Vulkan.
FPS is about the same in Hyrule Warriors, Pokemon Scarlet, and Pikmin 4. Nothing remarkable there.

There may be a minor regression in Star Ocean 2

| Branch | FPS Mean | Median | p99% | Frametime StdDev (ms) | Stutter % |
|---|---|---|---|---|---|
| migration-rewrite | 52.67 | 53.49 | 41.31 | 1.48 | 0.00 |
| master | 56.06 | 56.49 | 48.61 | 0.75 | 0.00 |

gdkchan · 2024-05-18T00:59:01Z

src/Ryujinx.Graphics.Gpu/Memory/BufferStage.cs

+        StorageWrite = 0x80,
+
+#pragma warning disable CA1069 // Enums values should not be duplicated
+        StorageAtomic = 0xc0


Would the compiler stop complaining if you set it to `StorageAtomic = StorageRead | StorageWrite`?

riperiperi · 2024-05-18

No, it still doesn't like that. These aren't really flags anyway; it's more like a sequential enum shifted up.

gdkchan · 2024-05-19

FWIW I just noticed that in other places, we use this attribute:

[SuppressMessage("Design", "CA1069: Enums values should not be duplicated")]
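For reference, the attribute form mentioned above could look roughly like the sketch below. This is not the actual `BufferStage` definition: the enum name `BufferStageSketch`, the `StorageRead = 0x40` value (inferred from the `StorageRead | StorageWrite` suggestion) and the `StorageMask` member are assumptions made only so that a duplicated value exists to suppress. Only `StorageWrite = 0x80` and `StorageAtomic = 0xc0` come from the diff.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical stand-in for the real enum; member set is assumed, see lead-in.
// The attribute replaces the #pragma warning disable CA1069 used in the diff.
[SuppressMessage("Design", "CA1069: Enums values should not be duplicated")]
internal enum BufferStageSketch : byte
{
    StorageRead = 0x40,   // assumed value
    StorageWrite = 0x80,  // from the diff
    StorageMask = 0xc0,   // hypothetical member sharing a value with StorageAtomic
    StorageAtomic = 0xc0, // from the diff
}

internal static class Program
{
    private static void Main()
    {
        // Both members carry the same underlying value.
        Console.WriteLine(BufferStageSketch.StorageMask == BufferStageSketch.StorageAtomic);

        // The numeric value matches the OR of the read and write members.
        var combined = BufferStageSketch.StorageRead | BufferStageSketch.StorageWrite;
        Console.WriteLine((int)combined == (int)BufferStageSketch.StorageAtomic);
    }
}
```

The attribute scopes the suppression to the one type, whereas a `#pragma warning disable` stays in effect until a matching `restore` or the end of the file.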

gdkchan · 2024-05-18T01:06:09Z

src/Ryujinx.Graphics.Vulkan/VulkanRenderer.cs

+            else
+            {
+                memoryType = Vendor == Vendor.Nvidia ?
+                    SystemMemoryType.DedicatedMemorySlowStorage :


I wonder why only NVIDIA has "slow storage".

gdkchan · 2024-05-18T18:19:55Z

src/Ryujinx.Graphics.Vulkan/BufferHolder.cs

-        private int _flushCount;
-        private int _flushTemp;
-        private int _lastFlushWrite = -1;
+        private readonly BufferAllocationType _baseType;


This can be removed too since it's unused. Or is there a reason for keeping it?

riperiperi · 2024-05-19

Nothing right now, I think, but it will still contain Auto when a buffer is auto-selected as HostMapped, so it could be useful for debugging.

gdkchan · 2024-05-19 (approved)

lgtm, thanks! I tested a few games here and found no issues. It's nice to see buffer migration reworked, and the backend getting simplified too.

lostromb · 2024-05-20T21:37:11Z

Final perf numbers, tested on Windows 11 with an AMD 5500XT and an Nvidia Titan X.
Astral Chain seems to improve on AMD. TOTK seems to improve on Nvidia.
No other clear movers in this set of games. Testing was limited because my monitor caps at 60 Hz.

| Game | Branch | FPS Mean | Median | p99% | Frametime StdDev (ms) | Stutter % |
|---|---|---|---|---|---|---|
| Tears of the Kingdom | AMD Master | 41.35 | 41.99 | 34.42 | 1.46 | 0.00 |
| Tears of the Kingdom | AMD PR | 39.82 | 40.45 | 31.22 | 2.11 | 0.00 |
| Tears of the Kingdom | Nvidia Master | 46.50 | 48.57 | 29.82 | 2.67 | 0.00 |
| Tears of the Kingdom | Nvidia PR | 50.32 | 52.26 | 38.57 | 1.99 | 0.00 |
| Astral Chain | AMD Master | 56.67 | 57.06 | 45.22 | 1.42 | 0.00 |
| Astral Chain | AMD PR | 59.93 | 61.30 | 41.83 | 2.49 | 0.00 |
| Star Ocean 2 | AMD Master | 52.71 | 53.62 | 41.66 | 1.41 | 0.00 |
| Star Ocean 2 | AMD PR | 53.49 | 55.48 | 39.22 | 1.89 | 0.00 |
| Star Ocean 2 | Nvidia Master | 78.26 | 78.96 | 65.97 | 0.71 | 0.05 |
| Star Ocean 2 | Nvidia PR | 77.84 | 78.60 | 65.59 | 0.59 | 0.00 |

jfrankpax8 · 2024-05-23T19:56:11Z

> Just for the record, this is my test on Zelda Tears of the Kingdom (on Ryzen 9 7900X and RTX 3060): master: PR: (Around +30 fps on title screen, +10 ingame).

I hunger for this performance on my macbook air!

riperiperi added 4 commits May 10, 2024 20:09

Add copy deactivation

5c98f51

Fix dependent virtual buffers

e8882d5

Remove logging

b671c32

riperiperi added the gpu and performance labels May 10, 2024

github-actions bot added the graphics-backend:vulkan and graphics-backend:opengl labels May 10, 2024

Fix format issues (maybe)

5bc01d7

gdkchan reviewed May 11, 2024

src/Ryujinx.Graphics.Gpu/Memory/BufferCache.cs (outdated, resolved)

riperiperi added 2 commits May 12, 2024 17:42

Vulkan: Remove backing swap

03ee8f3

Add explicit memory access types for most buffers

b7ea369

riperiperi added 3 commits May 13, 2024 20:48

Fix typo

72e5166

Add device local force expiry, change buffer inheritance behaviour

5d0097f

General cleanup, OGL fix

7234f37

riperiperi added 3 commits May 17, 2024 20:43

BufferPreFlush comments

0e11ee3

BufferBackingState comments

c075f4c

Add an extra precaution to BufferMigration

2779964

This is very unlikely, but it's important to cover loose ends like this.

riperiperi marked this pull request as ready for review May 17, 2024 23:16

ryujinx-mako bot requested a review from a team May 17, 2024 23:16

gdkchan reviewed May 18, 2024

Address some feedback

5394bbc

ryujinx-mako bot requested a review from a team May 18, 2024 12:40

Docs

fd529f0

gdkchan reviewed May 18, 2024

gdkchan approved these changes May 19, 2024

gdkchan merged commit eb1ce41 into Ryujinx:master May 19, 2024
12 checks passed

gdkchan mentioned this pull request Jun 2, 2024

[Bug] Graphical regression in Splatoon 3 on version 1.1.741 and onwards #5239

Closed
